A Symbolic Corpus-based Approach to Detect and Solve the Ambiguity of Discourse Markers
Abstract
Discourse parsing is currently an important research topic, and Rhetorical Structure Theory (RST) is one of the most popular approaches in this field. In general, discourse parsing comprises three stages: discourse segmentation, discourse relation detection, and rhetorical tree building. Discourse parsers are developed using different strategies. One strategy for detecting discourse relations relies on symbolic rules that take into account linguistic clues such as discourse markers. However, some discourse markers are ambiguous; that is, they can signal more than one discourse relation, which poses a problem when discourse relations are assigned automatically. In this paper, we develop a symbolic approach to detect and resolve discourse marker ambiguity in Spanish. First, we detect ambiguous discourse markers using the training corpus of the RST Spanish Treebank. Second, we extract linguistic contexts for these markers. Third, we design linguistic rules to resolve the ambiguity. Fourth, we evaluate the rules on the test corpus of the RST Spanish Treebank. Our approach outperforms a baseline built following the state-of-the-art methodology. We therefore consider the results representative and a first step towards disambiguating discourse marker senses in Spanish. There is nonetheless room for improvement, and we present the main limitations of the approach. In future work, the rules will be integrated into a discourse parser for Spanish, and several related applications will be developed, including automatic summarization and information extraction.
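The first step described in the abstract, detecting ambiguous discourse markers in an annotated corpus, can be illustrated with a minimal sketch. It assumes the corpus has already been reduced to (marker, relation) pairs. The pairs below are invented toy data, not taken from the RST Spanish Treebank, and the function name `find_ambiguous_markers` is hypothetical rather than the authors' implementation.

```python
from collections import defaultdict

# Toy (marker, relation) pairs, invented for illustration only.
# They are NOT drawn from the RST Spanish Treebank.
TOY_ANNOTATIONS = [
    ("porque", "Cause"),
    ("porque", "Cause"),
    ("mientras", "Contrast"),
    ("mientras", "Circumstance"),
    ("para", "Purpose"),
    ("pero", "Antithesis"),
    ("pero", "Concession"),
]


def find_ambiguous_markers(annotations):
    """Return {marker: sorted relation labels} for markers seen with
    more than one distinct relation in the annotated pairs."""
    relations_by_marker = defaultdict(set)
    for marker, relation in annotations:
        # Normalise case so "Pero" and "pero" count as the same marker.
        relations_by_marker[marker.lower()].add(relation)
    return {
        marker: sorted(relations)
        for marker, relations in relations_by_marker.items()
        if len(relations) > 1
    }


if __name__ == "__main__":
    for marker, relations in find_ambiguous_markers(TOY_ANNOTATIONS).items():
        print(marker, "->", ", ".join(relations))
```

Under this sketch, a marker counts as ambiguous as soon as it co-occurs with two different relation labels. A real pipeline would likely also need marker tokenisation and a frequency threshold to filter out annotation noise, neither of which is specified in the abstract.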
Journal: Research in Computing Science
Volume: 70
Publication date: 2013